Abstract: Internet exchange points (IXPs) emerged to remedy the deficiency of peering connections among autonomous
systems (ASes). IXPs play an important role in reducing the cost of transit connections over the Internet. This work
studies the popularity of IXPs over the Internet and consists of two main parts. The first part
is a measurement study of multiple historical snapshots of IXPs. These historical data have been harvested for different
European IXPs, with emphasis on the Amsterdam IXP (AMS-IX). In the second part, two nonlinear autoregressive exogenous
(NARX) back-propagation neural network (BPNN) models have been implemented to predict the following: the future
traffic volume that the AMS-IX IXP will transit and the number of participant networks that will use the AMS-IX
IXP services. We utilized data collected from the AMS-IX IXP to implement these models. Our results show that ASes have
understood the important roles that IXPs play on the Internet. Moreover, the traffic volume carried by the IXPs
is growing rapidly. Finally, our implemented NARX BPNN models show a considerable degree of fidelity, with
a regression value of more than 99% and negligible error.

Key words: Nonlinear autoregressive exogenous (NARX), internet exchange points (IXPs), back-propagation neural 
network models (BPNN), Amsterdam internet exchange point (AMS-IX), traffic volume 

1. Introduction 

The Internet is defined as a network of networks. These networks, such as CDNs, ISPs, and clouds, vary in size
and purpose. Nevertheless, these networks are called autonomous systems (ASes) in the top-level view
of the Internet. Usually ASes are connected with each other to route traffic to their final destinations. The
relationships between different ASes are specified by different economic models. These models have three
main types: customer-provider model, settlement-agreement peering model, and sibling model.

A settlement-agreement peering model is implemented free of charge between two ASes. This type of
relationship can reduce the cost of transit traffic that parent ASes charge their children in a customer-provider
model. Peering relationships are widely implemented across the Internet. To reduce the cost of implementing
a private peering link between two ASes, third party companies deploy a public location and infrastructure for
ASes to connect and implement peering relationships. These public locations allow ASes to implement more
than one peering relationship without extra connection links. They are called internet exchange
points (IXPs).

In the past few years, the number of IXPs has grown to more than 400 over the Internet (http://Peeringdb.com).
These IXPs vary in size, number of connected ASes, cost of connection, type of infrastructure technology, and
the number of offered ports and speeds. IXPs route Tbps of traffic (http://ams-ix.net/technical/statistics) all over the
world. This makes them an important component of the internet structure.

Research on IXPs is important since they have a noticeable effect on the fidelity of internet modeling,
AS graph construction, and network application simulations. However, such studies usually overlook questions such as
the growth rate of IXPs in the Internet and whether IXPs are an important component of the internet
infrastructure.

In this work we try to answer such questions. The contribution of this paper is twofold. First, we
conducted a measurement study of the growth history of IXPs to test their popularity in the Internet over
time. Second, we utilized the data provided by AMS-IX IXP as our case study (AMS-IX is the world’s largest
IXP, with more than 600 ASes and peak traffic of more than 3 Tbps (http://ams-ix.net/technical/statistics)) to
build two prediction models. The two models are built utilizing time series analysis and the nonlinear autoregressive
exogenous model (NARX). The first model is used to predict the growth rate in the number of participant ASes
of AMS-IX IXP. The second model predicts the growth rate in the traffic load that passes through this IXP.

The data used in this work have been collected from the Wayback Machine project (https://archive.org/web/).
We collected snapshots of the AMS-IX IXP website over the past 16 years. Moreover, snapshots of another 60
IXPs have been harvested.

The rest of this paper is organized as follows. Section 2 introduces related works that have been conducted
in the IXPs area; overviews IXPs, time series analysis, and the NARX neural network; and states the proposed prediction
models. Section 3 demonstrates our conducted experiment and its parts, and discusses the results. Section 4
concludes our work.

1.1. Overview and the proposed method 

This section starts by presenting a literature review of works that have been conducted in the area of IXPs.
Subsequently, IXPs, time series analysis, and the NARX neural network are overviewed. Finally, our proposed
prediction models are demonstrated.

2. Related work 

IXPs have been studied intensively in the past few years [1,2]. Research has been done on inferring IXPs over
the Internet [3], their locations between AS tiers [4], their IP addresses [5], and correcting the IP-AS mapping
process that IXP addresses affect [6]. Nevertheless, inferring peering relationships through measurement studies
[7,8] or passive and active BGP probing [9] has received the heaviest emphasis. These studies demonstrated the
important roles of IXPs in the infrastructure of the Internet and their impact on internet modeling.

In recent years, researchers started to study IXPs from different points of view. In [10], researchers
conducted a measurement study to test the correctness of IXP information in the PeeringDB (http://Peeringdb.com)
database. This database has been utilized in internet modeling, IP-to-AS mapping, and AS graph construction.
The authors found that PeeringDB information is up to date. In [11], an anatomical measurement study of a
European IXP has been conducted. Properties like traffic load, number of peering relationships, and
infrastructure have been demonstrated. The authors showed that the number of previously known peering
relationships is very far from reality. They discovered more than 200K new peering links. In [12], a measurement
study of an African IXP has also been conducted. This study demonstrated the traffic load, links, and peering
relations of an IXP that is located in Africa. All of these measurement studies emphasized inferring
relationships and traffic loads. In [13], the authors generated a pricing model for IXPs from an economic point of
view. However, neither these studies nor the internet modeling research studied the growth rate, size, or the
popularity of IXPs against private peers.

Finally, we have to mention the work that has been conducted in [14]. It was the first work that attempted
to study an IXP over time. SOX IXP has been used as a source of data. Peering matrix development and traffic
load passing through this IXP have been studied and shown over the years. This work shows how the IXP has
changed from transiting a few Mbps to more than a Tbps. However, the SOX IXP that has been studied is not
a large IXP. In addition, the time period and the number of snapshots used did not cover as large a time span as
in this work.

This work differs from the mentioned research in three ways. First, most IXP measurement studies
utilize data provided by IXPs. In our work, however, we harvested data from different sources by crawling IXP
websites, PeeringDB, and the Internet Archive for historical data. This required us to program a web crawler that
can adjust itself dynamically to changes of websites over time. Second, other measurement studies use new
data that IXPs collect without considering historical data, which can give insights into how to develop the IXP
infrastructure, such as the number of ports it needs. Finally, in this work we attempted to generate models that
can be utilized to predict traffic volumes and the number of participating ASes, which will be helpful in infrastructure
planning and upgrading processes.


2.1. Internet exchange points (IXPs) 

IXPs are defined as physical internet traffic exchange nodes. ISPs and other ASes exchange traffic between
themselves through these nodes. Network access points (NAPs) are the predecessors of IXPs [15]. IXPs are
constructed mainly of network switches. According to the PeeringDB database, there are more than 400 IXPs
around the world. They vary in size, fees, and policy. Figure 1 shows the distribution of IXPs over the world’s
continents. This figure demonstrates that Europe dominates in the number of deployed IXPs. However, the
USA, as a single country, leads with 90 IXPs. Nevertheless, the Amsterdam IXP (AMS-IX) is the largest IXP in the
world, with more than 600 connected ASes and peak traffic of more than 3 Tbps. AMS-IX IXP was established
in 1990 as one of the first IXPs in Europe. This is one of the reasons that motivated us to adopt AMS-IX IXP
as our case study.



Figure 1. Distribution of IXPs over the world’s continents (x-axis, Continents: Africa, Europe, Asia, North America, Latin America).

When ASes connect to IXPs, they implement one of three types of policy. These policies are used when
other ASes attempt to construct a peering relationship with them. These policies are open, selective, and 
restrictive. An open policy is defined as the ability of any AS to construct a peering relation with any other 
open policy AS. On the other hand, selective and restrictive policies require agreements between ASes before 
constructing any relationship links [16]. 


2.2. Time series analysis and NARX 

Time series analysis is defined as a collection of methods and procedures that find coherency among events
occurring over a period of time. Subsequently, it can predict the occurrence of new events. Time series analysis
can be divided into two main paradigms: statistical and intelligent methods. Statistical methods [17], such
as the fractional difference model, structure model, and Bayesian method, are easy to understand and implement.
However, they are not tractable for complex time series and complex evolutions [18]. On the other hand,
intelligent methods [19,20], such as NARX, multilayer perceptrons with back propagation, and neural networks,
are better for time series analysis with missing and incomplete data. Moreover, they can model nonlinear
problems [21].

In this work, NARX has been utilized to generate our prediction network models. The following section
demonstrates NARX and our models.

2.3. The proposed prediction models 

The nonlinear autoregressive exogenous model (NARX) is a nonlinear autoregressive model that has exogenous
inputs. Autoregressive means that previous output values are fed back to the model as inputs; exogenous inputs
are additional, externally supplied series. NARX can be implemented utilizing a back propagation neural network
(BPNN), in which case the old output values must be fed back to the input.
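
For reference, a NARX model is commonly written in the general form below. The symbols are not taken from this paper and are introduced only for illustration: $y$ is the predicted series, $u$ is an exogenous input series, $n_y$ and $n_u$ are the numbers of lagged values used, and $f$ is the nonlinear mapping, which in our case would be approximated by the BPNN.

```latex
y(t) = f\bigl(y(t-1), \ldots, y(t-n_y),\; u(t-1), \ldots, u(t-n_u)\bigr)
```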

To generate a prediction model utilizing NARX, training data must be applied to the model. The input
and the output data in our models are of the same type: the total number of ASes and the traffic load. To
generate a training matrix and an output vector for the NARX model from the collected data, we first have to
find a correlation within the data. Figure 2 shows the changes in the average traffic load of AMS-IX IXP. Eq. (1)
has been used to normalize the data, where μ represents the mean value of the data set. It can be observed from
Figure 2 that the changes in traffic load do not increase steadily over the time periods; they alternate between
ups and downs in an inconsistent manner. Moreover, we can observe that the traffic load decreases for a maximum
of three consecutive snapshots before it goes up, while it increases for a maximum of four consecutive snapshots
before it goes down again. This correlation can be used to generate our training matrix.

Figure 3 shows how we divided the collected data to generate a training matrix and an output vector. As we can
observe, the input matrix has four columns. The first column is the feedback, i.e. the previously predicted
output of the prediction process. For the first row of the input matrix, where no previous feedback exists,
an initial value can be used. In this way, training data (the input matrix and the output vector) have been
generated utilizing the traffic volume data snapshots and the total number of participant ASes. The next step is to
implement a NARX neural model for prediction.


$$\text{Normalized value} = \frac{\text{Original value} - \mu}{\max\{\text{all values}\}} \tag{1}$$



3. Experiment and result discussion 

Our experiment consists of three main parts: data harvesting, modeling, and analysis. We will demonstrate 
each part in the following subsections. Figure 4 shows a flowchart of the experiment’s parts. 


3.1. Data harvesting, modeling, and analysis 

3.1.1. Data harvesting 

The data utilized in this study have been collected from three main sources. The first source is the PeeringDB
database. We implemented a web crawler to harvest IXP information, such as the autonomous system numbers
(ASNs) of the IXP clients and their countries. We used these data to generate results such as Figure 1. The
second source of data is the AMS-IX IXP website. However, its historical data could not be harvested from
the main website. To collect such information, the Wayback Machine (WBM) project has been utilized. The
WBM is a project that archives internet websites by saving snapshots of their webpages over the years.
The WBM project has archived 562 snapshots of the AMS-IX IXP website over 17 years, between 1996 and
2013. Another web crawler has been written to collect the traffic volume, number
of connected networks, and their names from the AMS-IX IXP website. However, the AMS-IX IXP website has
changed its design and configuration four times over a period of 16 years, which required us to change
the crawler code accordingly. This process allowed us to collect four different snapshots for every year
over a period of 16 years. These data have been used to implement a BPNN predictor model.
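
The selection of four snapshots per year from the archived captures can be sketched as follows. This is a minimal illustration rather than our crawler: it assumes the 14-digit `YYYYMMDDhhmmss` capture timestamps and the `https://web.archive.org/web/<timestamp>/<url>` address pattern of the archive, and the function names and sample timestamps are hypothetical.

```python
# Assumed address pattern of archived pages (not stated in the paper).
WAYBACK_PREFIX = "https://web.archive.org/web"


def quarter_of(timestamp):
    """Return (year, quarter) for a 14-digit YYYYMMDDhhmmss timestamp."""
    year = int(timestamp[0:4])
    month = int(timestamp[4:6])
    return year, (month - 1) // 3 + 1


def pick_quarterly_snapshots(timestamps):
    """Keep the earliest capture of every (year, quarter) pair."""
    chosen = {}
    # Fixed-width timestamps sort chronologically as plain strings.
    for ts in sorted(timestamps):
        chosen.setdefault(quarter_of(ts), ts)
    return list(chosen.values())


def snapshot_url(timestamp, site):
    """Build the address of one archived capture of `site`."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"
```

Picking one capture per quarter yields at most four snapshots per year; a page parser adapted to each website design would then be applied to every selected capture.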



Figure 4. Flowchart of the experiment’s parts. 


The third source of information is the peering matrix from the Euro-IX IXP website. The peering matrix
in this website consists of more than 70 IXPs and their members. We programmed a crawler to collect information
from this peering matrix through the WBM. The WBM has 322 snapshots of the Euro-IX IXP website. We
collected 13 snapshots with our crawler, one for each year during the period 2001-2014. Again, the
changes in this website’s design and configuration required us to program a dynamic web crawler that adapted
itself to these changes. This crawler was designed and written in Python. Finally, Table 1 provides a summary
of all collected data.


Table 1. A summary of collected data. 


Parameter                             Value
Number of ASes in PeeringDB           3382
Number of IXPs in PeeringDB           507
Number of IXPs with participants      416
Number of IXPs in Euro-IX             60
Number of ASes in Euro-IX             4587
Number of used snapshots of AMS-IX    54

3.1.2. Data processing

The analysis procedure consists of two phases: data preprocessing and data normalization. Data preprocessing
is the process of purifying the data from duplication or misleading information. For example, after collecting the
ISP names of AMS-IX IXP members, we found that many of them had changed their names, such as Easynet
Nederland (was Wirehub! Internet), or they had been acquired by other ISPs, such as IBM Global Network
acquired by AT&T and Global Crossing acquired by Level 3 Communications. This preprocessing step cannot
be handled by ASNs since their ASNs also have been changed. We have created a list with the names of all
ASes in the oldest snapshot. Subsequently, we wrote a script to search all snapshots for any
information about the name of the new network company that acquired them, marked by phrases such as
“Formerly Known” and “WAS”. If this information could not be found, we searched the Internet to find the names
of the networks that acquired them.
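
The marker search can be sketched as below. This is only an illustration under assumptions: the regular expression, the two markers it accepts, and the function names are hypothetical, and real member lists would need more patterns than the single "(was ...)" form shown in the example above.

```python
import re

# Hypothetical pattern: "Current Name (was Former Name)" or
# "Current Name (formerly known as Former Name)".
RENAME_PATTERN = re.compile(
    r"^(?P<current>.+?)\s*\((?:was|formerly known as)\s+(?P<former>.+?)\)\s*$",
    re.IGNORECASE,
)


def parse_member_name(entry):
    """Return (current_name, former_name or None) for one member entry."""
    text = entry.strip()
    match = RENAME_PATTERN.match(text)
    if match:
        return match.group("current"), match.group("former")
    return text, None


def build_alias_map(entries):
    """Map every former name found to the current name that replaced it."""
    aliases = {}
    for entry in entries:
        current, former = parse_member_name(entry)
        if former is not None:
            aliases[former] = current
    return aliases
```

Names that appear in the oldest snapshot but in neither the later member lists nor the alias map are the ones that still require a manual search.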

The data normalization phase starts off with normalizing the harvested snapshots from the WBM project. 
Data normalization is done utilizing Eq. (1). However, not all the data have been normalized. Neither Euro-IX 
IXP snapshots data nor PeeringDB data have been normalized before using them. 
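
As a worked illustration of Eq. (1), the sketch below subtracts the mean of the data set from each value and divides by the maximum value. The function name and the sample values are hypothetical.

```python
def normalize(values):
    """Apply Eq. (1): (value - mean of the data set) / maximum value."""
    data = [float(v) for v in values]
    if not data:
        raise ValueError("cannot normalize an empty data set")
    mean = sum(data) / len(data)
    peak = max(data)
    if peak == 0:
        raise ValueError("maximum value is zero; Eq. (1) is undefined")
    return [(v - mean) / peak for v in data]
```

For the values 1, 2, and 3, the mean is 2 and the maximum is 3, so the normalized values are -1/3, 0, and 1/3.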

3.1.3. Neural network models 

To deploy our neural network models, a back propagation neural network (BPNN) has been utilized. Two models
have been implemented. The first model is responsible for predicting the traffic volume of AMS-IX IXP over
a period of 16 years. The input data consist of monthly average traffic volume snapshots. We have used 192
points as input data. The data have been divided into three sections: 55% training, 20% testing, and 25%
validation data. These points have been arranged with feedback as mentioned earlier. The second model is
responsible for predicting the number of AMS-IX IXP’s connected members. We did not use monthly snapshots
over the 16-year period in this model because the WBM project did not keep monthly snapshots. Instead,
four points per year, spaced three months apart, have been recorded. In total, fifty points have been recorded and
used as input to our second BPNN model. The division of the input data is similar to that of the first model.
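
One possible arrangement of the points into a four-column input matrix and an output vector, followed by the 55/20/25 division, is sketched below. Several details are assumptions made only for illustration: the three non-feedback columns are taken to be consecutive previous points, the feedback column is seeded with the previous known target (in closed-loop use it would hold the previous prediction instead), and the division is taken to be chronological.

```python
import numpy as np


def build_training_matrix(series, n_lags=3, initial_feedback=0.0):
    """Return (inputs, targets): column 0 is feedback, columns 1.. are lags."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags
    if n_rows < 1:
        raise ValueError("series is too short for the requested lags")
    targets = series[n_lags:]
    inputs = np.empty((n_rows, n_lags + 1))
    for lag in range(n_lags):
        inputs[:, lag + 1] = series[lag:lag + n_rows]
    # The first row has no earlier output, so an initial value is used.
    inputs[0, 0] = initial_feedback
    # Assumption: later rows are seeded with the previous known target.
    inputs[1:, 0] = targets[:-1]
    return inputs, targets


def split_rows(n_rows, train=0.55, test=0.20):
    """Chronological split; the remainder (about 25%) is validation."""
    n_train = int(n_rows * train)
    n_test = int(n_rows * test)
    idx = np.arange(n_rows)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]
```

With 192 monthly points and three lagged columns, this arrangement yields 189 rows. Note that with the seeding used here the feedback column repeats the most recent lag; it only differs once it is filled with the model's own predictions.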

The proposed BPNN NARX models consist of three layers: input, hidden, and output layers. The hidden
layer consists of 10 nodes. The proposed system has been written in the Python scripting
language. Table 2 shows the configuration of our two BPNN NARX models.

Table 2. Configuration parameters of the two BPNN NARX models. 


Parameter                              Value
Maximum number of epochs to train      800
Maximum time to train in seconds       Infinity
Performance goal                       0
Maximum validation failures            18
Initial μ                              $1 \times 10^{-3}$
μ decrease factor                      0.1
μ increase factor                      10
Number of nodes in the hidden layer    10
Number of layers                       3
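
A three-layer network with 10 hidden nodes trained by back propagation can be sketched as follows. This is a simplified stand-in, not our implementation: it uses plain full-batch gradient descent with a hypothetical learning rate `lr`, a tanh hidden layer, and a linear output, and it ignores the initial value and the decrease/increase factors listed in Table 2 as well as the validation-failure stopping rule.

```python
import numpy as np


def train_mlp(X, y, n_hidden=10, epochs=800, lr=0.05, seed=0):
    """Train input -> tanh hidden -> linear output; return (params, MSE history)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(epochs):
        # Forward pass.
        hidden = np.tanh(X @ W1 + b1)
        out = hidden @ W2 + b2
        err = out - y
        history.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        grad_out = 2.0 * err / len(X)
        grad_W2 = hidden.T @ grad_out
        grad_b2 = grad_out.sum(axis=0)
        grad_hidden = (grad_out @ W2.T) * (1.0 - hidden ** 2)
        grad_W1 = X.T @ grad_hidden
        grad_b1 = grad_hidden.sum(axis=0)
        # Gradient-descent update.
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    return (W1, b1, W2, b2), history


def predict(params, X):
    """Evaluate the trained network on the rows of X."""
    W1, b1, W2, b2 = params
    return (np.tanh(np.asarray(X, dtype=float) @ W1 + b1) @ W2 + b2).ravel()
```

The default of 800 epochs mirrors the maximum number of epochs in Table 2; all other defaults are illustrative choices.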


3.2. Results analysis 

The analytical results of this work consist of three parts: IXPs’ popularity, historical growth, and the IXP
prediction model. The following subsections present each part.

IXPs’ popularity

The distribution of the AS-IXP participant degree is shown in Figure 5. This figure has been generated
using the PeeringDB crawled data. The figure shows that the distribution follows a power law, in which the
majority of ASes are members of only a small number of IXPs, and few ASes are members of a large number
of IXPs, such as Google (82 IXPs open policy), Akamai (93 IXPs open policy), Yahoo (34 IXPs open policy),
Amazon (48 IXPs open policy), and Microsoft (77 IXPs open policy). The reason is that content distribution
and media networks attempt to reduce their transit traffic by deploying as many peering relationships as
they can. They can fulfill this requirement by implementing an open policy at many IXPs to allow
other providers to peer with them. On the other hand, high-level internet service providers (ISPs), such as
Level 3 Communications (0 IXPs), China Telecom (4 IXPs with selective policy only), and AT&T (19 IXPs with
selective policy only), attempt to limit public peering, which increases the transit traffic cost paid by their customer ISPs.



Figure 6 shows the CDF of the number of IXPs that ASes are connected to. We can observe
that less than 50% of the collected ASes are connected to only one IXP. Moreover, more than 20% are connected
to two IXPs and more than 19% are connected to more than 3 IXPs. This CDF shows the popularity of IXPs.
In addition, it shows how ASes have understood the role of IXPs in the Internet. However, two things should be
mentioned. First, the number of ASes harvested and utilized to generate these figures is less than 5K. This
number is small compared to the number of ASes collected in AS graph studies [22]. Second, we have harvested
the number of private peering links that these ASes deployed and found that it is more
than 20K, which is comparable with the number of links generated through IXPs (21K). This shows that
many ISPs still prefer implementing their own links without the help of third party companies.
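
The participant degree and its CDF can be computed from the crawled membership records along the following lines. This is a minimal sketch; the record layout (pairs of an AS number and an IXP name) and the function names are assumptions for illustration.

```python
from collections import Counter


def ixp_degree(memberships):
    """memberships: iterable of (asn, ixp_name) pairs -> {asn: distinct IXP count}."""
    seen = {}
    for asn, ixp in memberships:
        seen.setdefault(asn, set()).add(ixp)
    return {asn: len(ixps) for asn, ixps in seen.items()}


def empirical_cdf(degrees):
    """Return sorted (degree, fraction of ASes with at most that degree) pairs."""
    counts = Counter(degrees.values())
    total = sum(counts.values())
    cdf = []
    running = 0
    for degree in sorted(counts):
        running += counts[degree]
        cdf.append((degree, running / total))
    return cdf
```

Plotting the degree counts on log-log axes is the usual visual check for a power-law shape, while the CDF pairs give the fractions read from Figure 6.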

Figure 7 shows the historical growth of the number of connected networks (members) of AMS-IX IXP.
Figure 7 demonstrates two main points. First, the number of networks that have connected to AMS-IX IXP exceeded
1K ASes; most of these networks have been members of AMS-IX IXP for the past 16 years, while others joined five or six
years ago and are still connected. Second, most networks that connected to AMS-IX IXP are still connected
and only a few networks disconnected. Nowadays, AMS-IX IXP has 680 connected networks. This shows that
more than 60% of the total number of networks that connected to AMS-IX IXP are still connected. One thing to
be mentioned here is that 120 ASes or networks that have disappeared from the member list of AMS-IX IXP
have been acquired by other companies or have gone out of business. This fact increases the percentage
to more than 75%. This percentage shows that the IXP members have experienced the advantages of IXPs and public
peering relationships.




Figure 7. Historical growth in the number of connected networks of AMS-IX IXP (axis label: Connected Networks).


Finally, Figure 8 shows the growth rate in the total number of connected networks against the number
of leased ports that ASes rent. This figure has been generated from the AMS-IX IXP snapshots harvested over
13 years. The figure shows that the number of leased ports that ASes utilize increases rapidly. It also can be
observed that the total number of connected ports is 100% more than the total number of ASes. The reason
for this is that the traffic volume of the AMS-IX IXP members across it has increased in a way that forced
them to lease more ports to reduce the load on IXP connected links.

Figure 8. Growth rate in the total number of connected networks against the number of leased ports that ASes rent.

3.2.1. Historical growth 

Figure 9 shows the growth rate in 60 IXPs registered in the Euro-IX IXP website. The figure is based on four
snapshots per year for 13 years; in total, 50 snapshots have been used. From each
snapshot, we have collected the number of IXPs that any AS belongs to. Then we repeated this step for all
ASes over the 50 harvested snapshots. As we mentioned, the number of ASes that have been found in Euro-IX
IXP is approximately 5K. Subsequently, we have calculated the changes in the number of IXPs that any AS
belongs to over all the snapshots, starting from the first time the AS occurs. We plotted these changes in this figure.
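
The change computation can be sketched as below. This is an illustration under assumptions: each snapshot is represented as a mapping from an AS number to the set of IXPs it belongs to, changes are taken between consecutive snapshots, and an AS missing from a later snapshot is counted as belonging to zero IXPs. The function name is hypothetical.

```python
def membership_changes(snapshots):
    """snapshots: chronological list of {asn: set of IXP names}.

    Returns {asn: [change in IXP count between consecutive snapshots]},
    counted from the first snapshot in which the AS occurs.
    """
    changes = {}
    previous = {}
    for snapshot in snapshots:
        # Visit ASes seen before as well as newly appearing ones.
        for asn in set(previous) | set(snapshot):
            current = len(snapshot.get(asn, ()))
            if asn in previous:
                changes.setdefault(asn, []).append(current - previous[asn])
            previous[asn] = current
    return changes
```

Positive entries correspond to newly joined IXPs and negative entries to disconnections, which are the two sides compared in the figure.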



Figure 9. Growth rate in 60 IXPs registered in Euro-IX website. 

We can observe from the figure that most of the ASes over all the snapshots have connected to or
disconnected from only one IXP. In addition, we can observe that other ASes increased the number of connected IXPs
massively, with four, five, six, and seven new IXPs. On the other hand, we can see that the reduction is
a maximum of two IXPs. This shows that the growth rate is higher than the decay rate in the number of
connected IXPs.

Figure 10 shows the historical changes that occurred in the number of IXPs that a certain AS belongs
to. We can observe that the number of changes follows a power-law distribution, with few ASes participating
in many IXPs over the years. However, we also can observe that a few ASes reduced the number of IXPs they
connect to. We think that ASes attempt to connect to an IXP with a massive number of connected networks
with open access. In other words, ASes will move away from IXPs that have a small number of ASes or a small
number of ASes with an open access policy.



3.2.2. BPNN NARX models 

As mentioned, two BPNN NARX models have been implemented. The first one predicts the growth rate in
the traffic volume of AMS-IX IXP. The second model predicts the growth in the number of networks connected
to AMS-IX IXP. Figure 11 shows a comparison between the real and predicted traffic volume capacity. It can
be observed that the accuracy of the prediction model is high. Figure 12 shows a comparison between the real
number of connected networks and the predicted one. We can observe from these figures that arranging the
data into four input columns allowed the models to reproduce the real data that we have collected.



Figure 11. Comparison between real and predicted traffic volume capacity.

Figure 12. Comparison between real and predicted number of connected networks.

Finally, to show their accuracy, Figures 13 and 14 show the regression output and the mean squared error (MSE)
values of our first and second models.


Figure 13. Regression output of the two BPNN NARX models. Panels: Training (R = 0.99944), Validation (R = 0.98882), Test (R = 0.99043), All (R = 0.99625).



Figure 14. MSE of NARX BPNN.

We can observe that the regression value of our models exceeded 99% and the MSE was less than $10^{-9}$.
This error value shows that these models have a high accuracy and can be used to predict the traffic volume
and number of connected networks in the future.
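
The two accuracy measures can be computed as in the sketch below. It assumes that the regression value R is the Pearson correlation coefficient between the real and predicted values, which is a common convention but is not stated explicitly here; the function names are hypothetical.

```python
import numpy as np


def regression_r(targets, predictions):
    """Pearson correlation between real and predicted values (assumed meaning of R)."""
    t = np.asarray(targets, dtype=float)
    p = np.asarray(predictions, dtype=float)
    return float(np.corrcoef(t, p)[0, 1])


def mean_squared_error(targets, predictions):
    """Average of the squared differences between real and predicted values."""
    t = np.asarray(targets, dtype=float)
    p = np.asarray(predictions, dtype=float)
    return float(np.mean((t - p) ** 2))
```

Because R measures only linear agreement, it is reported together with the MSE, which captures the absolute size of the prediction error on the normalized data.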

4. Conclusion 

Due to the important role they play in the infrastructure of the Internet, IXPs have been studied intensively in the
past few years. Our contribution in this work can be divided into two parts. In the first part, a measurement
study of the development of IXPs has been conducted. We have harvested historical data of a number of
European IXPs with emphasis on AMS-IX IXP. In the second part, we have proposed two BPNN-NARX
models: one model is used to predict the growth rate in traffic volume; the other model is used to predict the
growth rate in the number of participating ASes in AMS-IX IXP. Our results show that ASes have understood the
important roles that IXPs are playing in the Internet. In addition, the number of IXPs that ASes attempt to
connect to is increasing rapidly. Finally, our NARX BPNN models scored a regression value of 99% with an MSE
of $10^{-9}$.